Step Up BT live online career challenge

Approach B: Can you help the Data Science team?

Phuttachat Treerapee : University of Exeter

In [1]:
!pip install lightgbm
!pip install catboost

!pip install inflection
!pip install dython
!pip install shap
In [150]:
pip install chart_studio
In [2]:
pip install cufflinks
In [3]:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
from scipy.stats import norm, skew

from scipy import stats
import statsmodels.api as sm
import matplotlib.ticker as mtic
import seaborn as sns

Import essential machine learning libraries

In [4]:
# Preprocessing
from sklearn.impute import SimpleImputer
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import LabelEncoder, OneHotEncoder, OrdinalEncoder
from sklearn.preprocessing import MinMaxScaler, StandardScaler
from sklearn.feature_extraction.text import TfidfVectorizer

# Models
import sklearn
from sklearn import svm, tree, linear_model, neighbors
from sklearn import naive_bayes, ensemble, discriminant_analysis, gaussian_process
from sklearn.linear_model import LogisticRegression, RidgeClassifierCV
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from xgboost import XGBClassifier
from catboost import CatBoostClassifier
from lightgbm import LGBMRegressor, LGBMClassifier, Booster
init_func = LGBMRegressor

# Model selection and evaluation
from sklearn import feature_selection, model_selection, metrics
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.model_selection import GridSearchCV, ShuffleSplit, KFold
from sklearn.metrics import confusion_matrix, accuracy_score, classification_report
from sklearn.metrics import f1_score, precision_score, recall_score, fbeta_score
from sklearn.metrics import auc, roc_auc_score, roc_curve, precision_recall_curve
from sklearn.metrics import make_scorer, log_loss, average_precision_score
from sklearn.metrics import plot_confusion_matrix
from statsmodels.stats.outliers_influence import variance_inflation_factor

# Plotting and display
import seaborn as sns
import matplotlib
import matplotlib.pyplot as plt
import matplotlib.pylab as pylab
import matplotlib.ticker as mtick
from matplotlib import pyplot
from pylab import rcParams
%matplotlib inline
color = sns.color_palette()
from IPython.display import display
from pandas.plotting import scatter_matrix
pd.options.display.max_columns = None
import plotly
import plotly.express as px
import plotly.graph_objs as go
import plotly.offline as py
import plotly.figure_factory as ff
from plotly.offline import iplot
from plotly.subplots import make_subplots
import cufflinks as cf

# Model explanation
import shap

import warnings
warnings.filterwarnings("ignore")
In [5]:
import random
import os
import re
import sys
import timeit
import string
from datetime import datetime
# `time` is bound to the function time.time here, not to the time module.
from time import time
from dateutil.parser import parse
import joblib
import warnings
warnings.filterwarnings("ignore")
In [6]:
pip install xgboost
In [7]:
data = pd.read_csv('C:\\Users\\ployh\\OneDrive\\Desktop\\Data .csv')
In [8]:
data.head()
Out[8]:
customerID gender SeniorCitizen Partner Dependents tenure PhoneService MultipleLines InternetService OnlineSecurity OnlineBackup DeviceProtection TechSupport StreamingTV StreamingMovies Contract PaperlessBilling PaymentMethod MonthlyCharges TotalCharges Churn
0 7590-VHVEG Female 0 Yes No 1 No No phone service DSL No Yes No No No No Month-to-month Yes Electronic check 29.85 29.85 No
1 5575-GNVDE Male 0 No No 34 Yes No DSL Yes No Yes No No No One year No Mailed check 56.95 1889.5 No
2 3668-QPYBK Male 0 No No 2 Yes No DSL Yes Yes No No No No Month-to-month Yes Mailed check 53.85 108.15 Yes
3 7795-CFOCW Male 0 No No 45 No No phone service DSL Yes No Yes Yes No No One year No Bank transfer (automatic) 42.30 1840.75 No
4 9237-HQITU Female 0 No No 2 Yes No Fiber optic No No No No No No Month-to-month Yes Electronic check 70.70 151.65 Yes
In [9]:
data.columns
Out[9]:
Index(['customerID', 'gender', 'SeniorCitizen', 'Partner', 'Dependents',
       'tenure', 'PhoneService', 'MultipleLines', 'InternetService',
       'OnlineSecurity', 'OnlineBackup', 'DeviceProtection', 'TechSupport',
       'StreamingTV', 'StreamingMovies', 'Contract', 'PaperlessBilling',
       'PaymentMethod', 'MonthlyCharges', 'TotalCharges', 'Churn'],
      dtype='object')
In [10]:
data.describe()
Out[10]:
       SeniorCitizen       tenure  MonthlyCharges
count    7043.000000  7043.000000     7043.000000
mean        0.162147    32.371149       64.761692
std         0.368612    24.559481       30.090047
min         0.000000     0.000000       18.250000
25%         0.000000     9.000000       35.500000
50%         0.000000    29.000000       70.350000
75%         0.000000    55.000000       89.850000
max         1.000000    72.000000      118.750000
In [11]:
data.dtypes
Out[11]:
customerID           object
gender               object
SeniorCitizen         int64
Partner              object
Dependents           object
tenure                int64
PhoneService         object
MultipleLines        object
InternetService      object
OnlineSecurity       object
OnlineBackup         object
DeviceProtection     object
TechSupport          object
StreamingTV          object
StreamingMovies      object
Contract             object
PaperlessBilling     object
PaymentMethod        object
MonthlyCharges      float64
TotalCharges         object
Churn                object
dtype: object
In [12]:
data.columns.to_series().groupby(data.dtypes).groups
Out[12]:
{int64: ['SeniorCitizen', 'tenure'], float64: ['MonthlyCharges'], object: ['customerID', 'gender', 'Partner', 'Dependents', 'PhoneService', 'MultipleLines', 'InternetService', 'OnlineSecurity', 'OnlineBackup', 'DeviceProtection', 'TechSupport', 'StreamingTV', 'StreamingMovies', 'Contract', 'PaperlessBilling', 'PaymentMethod', 'TotalCharges', 'Churn']}
In [13]:
data.isna().any()
Out[13]:
customerID          False
gender              False
SeniorCitizen       False
Partner             False
Dependents          False
tenure              False
PhoneService        False
MultipleLines       False
InternetService     False
OnlineSecurity      False
OnlineBackup        False
DeviceProtection    False
TechSupport         False
StreamingTV         False
StreamingMovies     False
Contract            False
PaperlessBilling    False
PaymentMethod       False
MonthlyCharges      False
TotalCharges        False
Churn               False
dtype: bool

Unique values within each categorical variable

In [14]:
data["PaymentMethod"].nunique()
data["PaymentMethod"].unique()
data["Contract"].nunique()
data["Contract"].unique()
Out[14]:
array(['Month-to-month', 'One year', 'Two year'], dtype=object)
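
A notebook cell displays only its last expression, so the cell above shows the `Contract` values alone. The following is a minimal sketch of listing the unique values of every non-numeric column in one pass. The frame `sample` and its rows are hypothetical stand-ins for `data`; only the column names come from the notebook.

```python
import pandas as pd

# Hypothetical stand-in for `data`; only the column names come from the notebook.
sample = pd.DataFrame({
    "Contract": ["Month-to-month", "One year", "Two year", "One year"],
    "PaymentMethod": ["Electronic check", "Mailed check",
                      "Mailed check", "Electronic check"],
    "tenure": [1, 34, 2, 45],
})

# Keep only the non-numeric columns and collect their distinct values,
# in order of first appearance.
unique_values = {
    col: sample[col].unique().tolist()
    for col in sample.select_dtypes(exclude="number").columns
}

for col, values in unique_values.items():
    print(f"{col}: {len(values)} unique -> {values}")
```

On the real frame the same dictionary comprehension could be run against `data` directly, although `customerID` would then contribute one value per row.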

Check the distribution of the target variable

In [15]:
data["Churn"].value_counts()
Out[15]:
No     5174
Yes    1869
Name: Churn, dtype: int64

The dataset is imbalanced: 1,869 of the 7,043 customers (about 26.5%) have Churn = Yes, against 5,174 with Churn = No.
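
To put a number on the imbalance, the class counts can be turned into shares. This sketch rebuilds the counts from the `value_counts()` output above rather than loading the file; the variable names are illustrative.

```python
import pandas as pd

# Class counts copied from the value_counts() output above.
churn_counts = pd.Series({"No": 5174, "Yes": 1869}, name="Churn")

# Share of each class. On the real frame the equivalent call would be
# data["Churn"].value_counts(normalize=True).
churn_share = churn_counts / churn_counts.sum()
print(churn_share.round(3))
```

Roughly one customer in four belongs to the positive class, so accuracy alone would be a weak yardstick for any model trained on this target.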

In [16]:
# TotalCharges was read as text; convert it to float and turn unparseable entries into NaN.
data['TotalCharges'] = pd.to_numeric(data['TotalCharges'], errors='coerce')
In [17]:
data.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 7043 entries, 0 to 7042
Data columns (total 21 columns):
 #   Column            Non-Null Count  Dtype  
---  ------            --------------  -----  
 0   customerID        7043 non-null   object 
 1   gender            7043 non-null   object 
 2   SeniorCitizen     7043 non-null   int64  
 3   Partner           7043 non-null   object 
 4   Dependents        7043 non-null   object 
 5   tenure            7043 non-null   int64  
 6   PhoneService      7043 non-null   object 
 7   MultipleLines     7043 non-null   object 
 8   InternetService   7043 non-null   object 
 9   OnlineSecurity    7043 non-null   object 
 10  OnlineBackup      7043 non-null   object 
 11  DeviceProtection  7043 non-null   object 
 12  TechSupport       7043 non-null   object 
 13  StreamingTV       7043 non-null   object 
 14  StreamingMovies   7043 non-null   object 
 15  Contract          7043 non-null   object 
 16  PaperlessBilling  7043 non-null   object 
 17  PaymentMethod     7043 non-null   object 
 18  MonthlyCharges    7043 non-null   float64
 19  TotalCharges      7032 non-null   float64
 20  Churn             7043 non-null   object 
dtypes: float64(2), int64(2), object(17)
memory usage: 1.1+ MB

TotalCharges now has 11 missing values (7,032 non-null out of 7,043 rows), created when the numeric conversion could not parse an entry. They are filled in the next step.
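
The missing values appear only after the conversion because `errors='coerce'` replaces anything unparseable with NaN. The sketch below illustrates this on a hypothetical three-value series. It assumes the problem entries were blank strings, which the later `' '` check in the notebook suggests but the output does not confirm.

```python
import pandas as pd

# Hypothetical raw values; the blank string stands in for an entry
# that cannot be parsed as a number (an assumption, see lead-in).
raw = pd.Series(["29.85", " ", "1889.5"], name="TotalCharges")

# errors="coerce" turns anything that cannot be parsed into NaN.
numeric = pd.to_numeric(raw, errors="coerce")

# Count how many values became NaN.
print(numeric.isna().sum())
```

Running `data['TotalCharges'].isna().sum()` on the real frame gives the count directly, which is easier to read than the `info()` summary.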

In [18]:
data.isna().any()
Out[18]:
customerID          False
gender              False
SeniorCitizen       False
Partner             False
Dependents          False
tenure              False
PhoneService        False
MultipleLines       False
InternetService     False
OnlineSecurity      False
OnlineBackup        False
DeviceProtection    False
TechSupport         False
StreamingTV         False
StreamingMovies     False
Contract            False
PaperlessBilling    False
PaymentMethod       False
MonthlyCharges      False
TotalCharges         True
Churn               False
dtype: bool
In [19]:
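# Collect the names of the columns that contain at least one missing value.
fillna = data.isna().any()
fillna = fillna[fillna == True].reset_index()
fillna = fillna["index"].tolist()
# Fill numeric columns with their mean. round(0) is applied to the whole
# column, so every value is rounded, not only the filled ones.
for col in data.columns[1:]:
    if col in fillna:
        if data[col].dtype != 'object':
            data[col] = data[col].fillna(data[col].mean()).round(0)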
In [20]:
data.isna().any()
Out[20]:
customerID          False
gender              False
SeniorCitizen       False
Partner             False
Dependents          False
tenure              False
PhoneService        False
MultipleLines       False
InternetService     False
OnlineSecurity      False
OnlineBackup        False
DeviceProtection    False
TechSupport         False
StreamingTV         False
StreamingMovies     False
Contract            False
PaperlessBilling    False
PaymentMethod       False
MonthlyCharges      False
TotalCharges        False
Churn               False
dtype: bool
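
The fill step chains `.round(0)` after `.fillna(...)`, so the rounding touches every value in the column. That is why `TotalCharges` shows 30.0 rather than 29.85 in the later output. A small sketch on a hypothetical three-value column contrasts that pattern with rounding only the fill value; the second variant is an alternative shown for comparison, not the notebook's method.

```python
import numpy as np
import pandas as pd

# Hypothetical toy column with one missing value.
charges = pd.Series([29.85, np.nan, 108.15])

# Same pattern as the notebook cell: fill with the mean, then round everything.
filled_and_rounded = charges.fillna(charges.mean()).round(0)

# Alternative (not the notebook's method): round only the fill value,
# leaving the observed values untouched.
filled_only = charges.fillna(round(charges.mean(), 0))

print(filled_and_rounded.tolist())
print(filled_only.tolist())
```

Which variant is preferable depends on whether the pence-level detail in the observed charges matters for the downstream model.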

Drop the unnecessary customerID column

In [21]:
data2 = data.drop('customerID', axis=1).copy()
In [22]:
data2.head()
Out[22]:
gender SeniorCitizen Partner Dependents tenure PhoneService MultipleLines InternetService OnlineSecurity OnlineBackup DeviceProtection TechSupport StreamingTV StreamingMovies Contract PaperlessBilling PaymentMethod MonthlyCharges TotalCharges Churn
0 Female 0 Yes No 1 No No phone service DSL No Yes No No No No Month-to-month Yes Electronic check 29.85 30.0 No
1 Male 0 No No 34 Yes No DSL Yes No Yes No No No One year No Mailed check 56.95 1890.0 No
2 Male 0 No No 2 Yes No DSL Yes Yes No No No No Month-to-month Yes Mailed check 53.85 108.0 Yes
3 Male 0 No No 45 No No phone service DSL Yes No Yes Yes No No One year No Bank transfer (automatic) 42.30 1841.0 No
4 Female 0 No No 2 Yes No Fiber optic No No No No No No Month-to-month Yes Electronic check 70.70 152.0 Yes
In [23]:
data2['TotalCharges'][3826]
Out[23]:
2283.0
In [24]:
data2.iloc[3826]
Out[24]:
gender                             Male
SeniorCitizen                         0
Partner                             Yes
Dependents                          Yes
tenure                                0
PhoneService                        Yes
MultipleLines                       Yes
InternetService                      No
OnlineSecurity      No internet service
OnlineBackup        No internet service
DeviceProtection    No internet service
TechSupport         No internet service
StreamingTV         No internet service
StreamingMovies     No internet service
Contract                       Two year
PaperlessBilling                     No
PaymentMethod              Mailed check
MonthlyCharges                    25.35
TotalCharges                     2283.0
Churn                                No
Name: 3826, dtype: object
In [25]:
data2['TotalCharges']= data2['TotalCharges'].apply(lambda x: x if x!= ' ' else np.nan).astype(float)
In [26]:
data2.head()
Out[26]:
gender SeniorCitizen Partner Dependents tenure PhoneService MultipleLines InternetService OnlineSecurity OnlineBackup DeviceProtection TechSupport StreamingTV StreamingMovies Contract PaperlessBilling PaymentMethod MonthlyCharges TotalCharges Churn
0 Female 0 Yes No 1 No No phone service DSL No Yes No No No No Month-to-month Yes Electronic check 29.85 30.0 No
1 Male 0 No No 34 Yes No DSL Yes No Yes No No No One year No Mailed check 56.95 1890.0 No
2 Male 0 No No 2 Yes No DSL Yes Yes No No No No Month-to-month Yes Mailed check 53.85 108.0 Yes
3 Male 0 No No 45 No No phone service DSL Yes No Yes Yes No No One year No Bank transfer (automatic) 42.30 1841.0 No
4 Female 0 No No 2 Yes No Fiber optic No No No No No No Month-to-month Yes Electronic check 70.70 152.0 Yes
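
The conversion in cell In [25] maps a blank string to NaN before casting to float. A minimal sketch of the same idea on a toy Series (the values in `raw` are made up for illustration), compared with `pd.to_numeric(..., errors="coerce")`, which is a common alternative and not necessarily what the notebook used earlier:

```python
import numpy as np
import pandas as pd

# Hypothetical raw column: numbers stored as text, with one blank entry
raw = pd.Series(["29.85", " ", "108.15"], name="TotalCharges")

# Same pattern as the notebook cell: blank string -> NaN, then cast to float
converted = raw.apply(lambda x: x if x != " " else np.nan).astype(float)

# Alternative: let pandas coerce anything unparseable to NaN
coerced = pd.to_numeric(raw, errors="coerce")

print(converted)
print(coerced)
```

On this toy input both approaches give the same float column with one NaN. The coercing variant also turns other unparseable strings into NaN, which may or may not be desirable.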

Encoding part

In [27]:
lebel = LabelEncoder()
data2['Churn']=lebel.fit_transform(data2['Churn'])
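
To make the encoding explicit, here is a small sketch of what `LabelEncoder` does with a Yes/No column. It uses the first five `Churn` values shown in the table above; the names `churn`, `encoder`, `encoded` and `decoded` are illustrative only.

```python
from sklearn.preprocessing import LabelEncoder

# First five Churn values from the preview above
churn = ["No", "No", "Yes", "No", "Yes"]

encoder = LabelEncoder()
encoded = encoder.fit_transform(churn)

# Classes are stored in sorted order, so "No" maps to 0 and "Yes" to 1
print(list(encoder.classes_))
print(encoded.tolist())

# The mapping can be reversed when the original labels are needed again
decoded = encoder.inverse_transform(encoded)
print(list(decoded))
```

With this mapping, the mean of the encoded column is the churn rate.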
In [28]:
numeric= data2.select_dtypes('number').columns
category = data2.select_dtypes('object').columns
In [29]:
corr = data2[numeric].corr()
# Nonzero entries of the mask (upper triangle and diagonal) are hidden in the plot
matrix = np.triu(corr)
fig, ax = plt.subplots(figsize=(14, 10))
sns.heatmap(corr, annot=True, cmap='viridis', mask=matrix, ax=ax)
Out[29]:
<AxesSubplot:>
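
`sns.heatmap` hides every cell where the mask is truthy, so passing `np.triu` of the correlation matrix hides the diagonal and the upper triangle. A sketch of a boolean variant of that mask, which does not depend on the correlation values themselves being nonzero, is shown here on the first five rows of three numeric columns from the preview above. The names `toy`, `mask` and `lower` are illustrative, and no plot is drawn.

```python
import numpy as np
import pandas as pd

# First five rows of three numeric columns, taken from the preview above
toy = pd.DataFrame({
    "tenure": [1, 34, 2, 45, 2],
    "MonthlyCharges": [29.85, 56.95, 53.85, 42.30, 70.70],
    "TotalCharges": [30.0, 1890.0, 108.0, 1841.0, 152.0],
})

corr = toy.corr()

# True on the diagonal and above it: these cells would be hidden in the heatmap
mask = np.triu(np.ones_like(corr, dtype=bool))

# Same effect without plotting: blank out the masked cells
lower = corr.mask(mask)
print(lower)
```

The resulting `mask` could be passed as `mask=mask` to `sns.heatmap` in place of the numeric matrix.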

Categorical Features

In [30]:
data2[category].nunique()
Out[30]:
gender              2
Partner             2
Dependents          2
PhoneService        2
MultipleLines       3
InternetService     3
OnlineSecurity      3
OnlineBackup        3
DeviceProtection    3
TechSupport         3
StreamingTV         3
StreamingMovies     3
Contract            3
PaperlessBilling    2
PaymentMethod       4
dtype: int64
In [31]:
# Print the full DataFrame once to view the categorical columns in context
print(data2)
      gender  SeniorCitizen Partner Dependents  tenure PhoneService  \
0     Female              0     Yes         No       1           No   
1       Male              0      No         No      34          Yes   
2       Male              0      No         No       2          Yes   
3       Male              0      No         No      45           No   
4     Female              0      No         No       2          Yes   
...      ...            ...     ...        ...     ...          ...   
7038    Male              0     Yes        Yes      24          Yes   
7039  Female              0     Yes        Yes      72          Yes   
7040  Female              0     Yes        Yes      11           No   
7041    Male              1     Yes         No       4          Yes   
7042    Male              0      No         No      66          Yes   

         MultipleLines InternetService OnlineSecurity OnlineBackup  \
0     No phone service             DSL             No          Yes   
1                   No             DSL            Yes           No   
2                   No             DSL            Yes          Yes   
3     No phone service             DSL            Yes           No   
4                   No     Fiber optic             No           No   
...                ...             ...            ...          ...   
7038               Yes             DSL            Yes           No   
7039               Yes     Fiber optic             No          Yes   
7040  No phone service             DSL            Yes           No   
7041               Yes     Fiber optic             No           No   
7042                No     Fiber optic            Yes           No   

     DeviceProtection TechSupport StreamingTV StreamingMovies        Contract  \
0                  No          No          No              No  Month-to-month   
1                 Yes          No          No              No        One year   
2                  No          No          No              No  Month-to-month   
3                 Yes         Yes          No              No        One year   
4                  No          No          No              No  Month-to-month   
...               ...         ...         ...             ...             ...   
7038              Yes         Yes         Yes             Yes        One year   
7039              Yes          No         Yes             Yes        One year   
7040               No          No          No              No  Month-to-month   
7041               No          No          No              No  Month-to-month   
7042              Yes         Yes         Yes             Yes        Two year   

     PaperlessBilling              PaymentMethod  MonthlyCharges  \
0                 Yes           Electronic check           29.85   
1                  No               Mailed check           56.95   
2                 Yes               Mailed check           53.85   
3                  No  Bank transfer (automatic)           42.30   
4                 Yes           Electronic check           70.70   
...               ...                        ...             ...   
7038              Yes               Mailed check           84.80   
7039              Yes    Credit card (automatic)          103.20   
7040              Yes           Electronic check           29.60   
7041              Yes               Mailed check           74.40   
7042              Yes  Bank transfer (automatic)          105.65   

      TotalCharges  Churn  
0             30.0      0  
1           1890.0      0  
2            108.0      1  
3           1841.0      0  
4            152.0      1  
...            ...    ...  
7038        1990.0      0  
7039        7363.0      0  
7040         346.0      0  
7041         307.0      1  
7042        6844.0      0  

[7043 rows x 20 columns]
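
The categorical columns counted above can also be inspected one at a time. A minimal sketch on a toy frame built from the first five rows of the preview (the names `toy` and `summary` are illustrative, and `select_dtypes(exclude="number")` is used here as an assumption that every non-numeric column is categorical):

```python
import pandas as pd

# First five rows of two categorical columns and one numeric column
toy = pd.DataFrame({
    "gender": ["Female", "Male", "Male", "Male", "Female"],
    "Contract": ["Month-to-month", "One year", "Month-to-month",
                 "One year", "Month-to-month"],
    "tenure": [1, 34, 2, 45, 2],
})

# Treat every non-numeric column as categorical
category = toy.select_dtypes(exclude="number").columns

# Count how often each level occurs, one feature at a time
summary = {feature: toy[feature].value_counts().to_dict() for feature in category}

for feature, counts in summary.items():
    print(feature, counts)
```

On the full data, the same loop over `category` would list the levels behind each `nunique()` count.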
In [32]:
data2['MultipleLines']= data2['MultipleLines'].replace('No phone service','No')
In [33]:
data2[['OnlineSecurity','OnlineBackup','DeviceProtection','TechSupport','StreamingTV','StreamingMovies']]= data2[['OnlineSecurity','OnlineBackup','DeviceProtection','TechSupport','StreamingTV','StreamingMovies']].replace('No internet service','No')

Churn by gender¶

In [34]:
fig = px.histogram(data2, x="gender", color="Churn",width=500, height=600, color_discrete_map={
        0: '#553a99',
       1: '#d52685'
    })
fig.show()

There is little variation between male and female churn rates.

Churn by Partner¶

In [35]:
fig = px.histogram(data2, x="Partner", color="Churn",width=500, height=600, color_discrete_map={
        0: '#553a99',
       1: '#d52685'
    })
fig.show()

Single customers are nearly 1.7 times as likely to leave as those with a partner.
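Ratios like the one above can be obtained by comparing the mean of the 0/1 Churn column across groups. The following is a minimal sketch, assuming a small invented frame in place of data2 (the rows are illustrative and not taken from the dataset) and a hypothetical helper named churn_rate_ratio:

```python
import pandas as pd

# Toy stand-in for data2; the rows are invented for illustration only.
toy = pd.DataFrame({
    "Partner": ["Yes", "Yes", "Yes", "Yes", "No", "No", "No", "No"],
    "Churn":   [0,     0,     0,     1,     0,    0,    1,    1],
})

def churn_rate_ratio(df, column, group, baseline, target="Churn"):
    """Churn rate of `group` divided by churn rate of `baseline` (hypothetical helper)."""
    # The mean of a 0/1 column is the share of churned customers in each group.
    rates = df.groupby(column)[target].mean()
    return rates[group] / rates[baseline]

ratio = churn_rate_ratio(toy, "Partner", group="No", baseline="Yes")
print(ratio)
```

The same helper could be pointed at any of the other categorical columns, for example Dependents or Contract, by changing the column, group and baseline arguments.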

Churn by Dependents¶

In [36]:
fig = px.histogram(data2, x="Dependents", color="Churn",width=500, height=600, color_discrete_map={
        0: '#553a99',
       1: '#d52685'
    })
fig.show()

Customers without dependents are roughly 2.03 times more likely to leave than those with dependents.

Churn by Phone Service¶

In [37]:
fig = px.histogram(data2, x="PhoneService", color="Churn",width=500, height=600, color_discrete_map={
        0: '#553a99',
       1: '#d52685'
    })
fig.show()

The difference in churn rate between customers who have home phone service with the provider and those who do not is negligible.

Churn by Multiple Lines¶

In [38]:
fig = px.histogram(data2, x="MultipleLines", color="Churn",width=500, height=600, color_discrete_map={
        0: '#553a99',
       1: '#d52685'
    })
fig.show()

The difference in churn rate between customers who have multiple lines of phone service with the provider and those who do not is minimal.

Churn by Internet Service¶

In [39]:
fig = px.histogram(data2, x="InternetService", color="Churn",width=500, height=600, color_discrete_map={
        0: '#553a99',
       1: '#d52685'
    })
fig.show()

Customers with a fibre optic internet connection are 5.66 times more likely to churn than customers without internet service.

Churn by Online Security¶

In [40]:
fig = px.histogram(data2, x="OnlineSecurity", color="Churn",width=500, height=600, color_discrete_map={
        0: '#553a99',
       1: '#d52685'
    })
fig.show()

A client with an online security service provided by the firm is nearly 2.14 times less likely to abandon the company than a consumer without such a service.

Churn by Device Protection¶

In [41]:
fig = px.histogram(data2, x="DeviceProtection", color="Churn",width=500, height=600, color_discrete_map={
        0: '#553a99',
       1: '#d52685'
    })
fig.show()

A client with a device protection service from the firm is about 1.27 times less likely to depart than a customer without such a service.

Churn by Tech Support¶

In [42]:
fig = px.histogram(data2, x="TechSupport", color="Churn",width=500, height=600, color_discrete_map={
        0: '#553a99',
       1: '#d52685'
    })
fig.show()

A customer with a Tech Support service is almost 2.06 times less likely to leave the company than a customer without one.

Churn by Streaming TV¶

In [43]:
fig = px.histogram(data2, x="StreamingTV", color="Churn",width=500, height=600, color_discrete_map={
        0: '#553a99',
       1: '#d52685'
    })
fig.show()

A client with a Streaming TV service from the firm is 1.24 times more likely to depart than a customer without such a service.

In [44]:
fig = px.histogram(data2, x="StreamingMovies", color="Churn",width=500, height=600, color_discrete_map={
        0: '#553a99',
       1: '#d52685'
    })
fig.show()

A client with a Streaming Movies service from the firm is about 1.23 times more likely to depart than a customer without such a service.

Churn by Contract¶

In [45]:
fig = px.histogram(data2, x="Contract", color="Churn",width=500, height=600, color_discrete_map={
        0: '#553a99',
       1: '#d52685'
    })
fig.show()

The histogram and the mean churn rates differ significantly by contract type. Customers with a two-year contract are about 15.1 times less likely to churn than those on a monthly plan, while customers with an annual contract are 3.79 times less likely to churn than those on a monthly contract.

In [46]:
fig = px.histogram(data2, x="PaperlessBilling", color="Churn",width=500, height=600, color_discrete_map={
        0: '#553a99',
       1: '#d52685'
    })
fig.show()

A consumer who receives paperless invoicing from the organisation is nearly 2.06 times more likely to depart than a customer who does not.

In [47]:
fig = px.histogram(data2, x="PaymentMethod", color="Churn",width=500, height=600, color_discrete_map={
        0: '#553a99',
       1: '#d52685'
    })
fig.show()

Almost half of the customers who pay by electronic check churn.

Data Manipulation¶

Train and Test Split¶

In [48]:
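# Separate the features from the target column.
X = data2.drop('Churn', axis=1)
y = data2['Churn']
# Treat every non-float column as categorical (np.float is deprecated, so use np.float64).
categorical_features_indices = np.where(X.dtypes != np.float64)[0]

# Hold out 30% of the rows for testing.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)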

Catboost Classification¶

The model I chose is CatBoost classification. The name CatBoost comes from "Category" and "Boosting". CatBoost is built on gradient-boosted decision trees. During training, decision trees are built one after another, and each new tree is fitted to reduce the loss left by the trees before it. The number of trees is set by the initial parameters.
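To illustrate the idea of successive trees reducing the loss, here is a toy sketch of gradient boosting with one-split "stumps" and squared error, in plain Python. It is a simplified illustration only and not CatBoost's actual algorithm, which differs in many ways. The data and all function names are made up for this example.

```python
def fit_stump(xs, residuals):
    """Pick the single split with the lowest squared error.

    Assumes xs contains at least two distinct values.
    Returns (threshold, left_value, right_value).
    """
    best = None
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    return best[1:]


def predict_stump(stump, x):
    threshold, left_value, right_value = stump
    return left_value if x <= threshold else right_value


def boost(xs, ys, n_trees=10, learning_rate=0.5):
    """Fit stumps one after another, each on the residuals of the current model."""
    base = sum(ys) / len(ys)
    predictions = [base] * len(ys)
    stumps, losses = [], []
    for _ in range(n_trees):
        # For squared error, the residual is the negative gradient of the loss.
        residuals = [y - p for y, p in zip(ys, predictions)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        predictions = [p + learning_rate * predict_stump(stump, x)
                       for x, p in zip(xs, predictions)]
        losses.append(sum((y - p) ** 2 for y, p in zip(ys, predictions)) / len(ys))
    return base, stumps, losses


def predict(base, stumps, x, learning_rate=0.5):
    return base + learning_rate * sum(predict_stump(s, x) for s in stumps)


# Made-up toy data: one numeric feature and a numeric target.
toy_x = [1, 2, 3, 4, 5, 6]
toy_y = [1.0, 1.2, 0.9, 3.1, 2.9, 3.0]

base, stumps, losses = boost(toy_x, toy_y, n_trees=10)
print([round(loss, 4) for loss in losses])  # training loss after each added stump
```

The printed training loss does not increase from one stump to the next, which is the behaviour the paragraph above describes. The `n_trees` argument plays the role of the "number of trees" setting.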

In this step, we focus mainly on the recall score. Recall is the ratio tp / (tp + fn), where tp is the number of true positives and fn is the number of false negatives. Intuitively, recall is the classifier's ability to find all the positive samples (here, the customers who churn).
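The recall formula can be checked by hand on a small example. The sketch below counts the confusion-matrix cells for made-up labels and applies tp / (tp + fn). The function names are hypothetical, and returning 0.0 when there are no positive samples is a choice made for this sketch.

```python
def confusion_counts(y_true, y_pred):
    """Count (tp, fn, fp, tn) for binary labels where 1 is the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fn, fp, tn


def recall_from_counts(tp, fn):
    """Recall = tp / (tp + fn); 0.0 if there are no positive samples."""
    positives = tp + fn
    return tp / positives if positives else 0.0


# Made-up labels: 1 = churned, 0 = stayed.
toy_true = [1, 1, 1, 1, 0, 0, 0, 0]
toy_pred = [1, 1, 0, 0, 0, 0, 0, 1]

tp, fn, fp, tn = confusion_counts(toy_true, toy_pred)
toy_recall = recall_from_counts(tp, fn)
print(tp, fn, fp, tn, toy_recall)  # 2 2 1 3 0.5
```

In this toy case, two of the four churners are found, so recall is 0.5. The one false positive does not affect recall at all, which is why precision is reported alongside it in the tables that follow.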

In [49]:
# Metric containers for the results table.
accuracy = []
recall = []
roc_auc = []
precision = []

# Baseline CatBoost model with default class weights.
catboost_base = CatBoostClassifier(verbose=False, random_state=0)

# Note: the test set also serves as eval_set here, so the scores below may be optimistic.
catboost_base.fit(X_train, y_train, cat_features=categorical_features_indices, eval_set=(X_test, y_test))
y_pred = catboost_base.predict(X_test)

accuracy.append(round(accuracy_score(y_test, y_pred), 4))
recall.append(round(recall_score(y_test, y_pred), 4))
# Note: ROC AUC is computed from hard class labels here, not predicted probabilities.
roc_auc.append(round(roc_auc_score(y_test, y_pred), 4))
precision.append(round(precision_score(y_test, y_pred), 4))

model_names = ['Catboost']
result_cat = pd.DataFrame({'Accuracy': accuracy, 'Recall': recall, 'Roc_Auc': roc_auc, 'Precision': precision}, index=model_names)
result_cat
Out[49]:
Accuracy Recall Roc_Auc Precision
Catboost 0.8012 0.5105 0.7101 0.6782
In [50]:
fig, ax = plt.subplots(figsize=(5, 5))
plot_confusion_matrix(catboost_base, X_test, y_test, cmap=plt.cm.Greens, ax=ax);
In [51]:
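# Metric containers for the results table.
accuracy = []
recall = []
roc_auc = []
precision = []

# scale_pos_weight=3 weights the positive (churn) class three times as heavily as the negative class.
catboost_scale3 = CatBoostClassifier(verbose=False, random_state=0, scale_pos_weight=3)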

catboost_scale3.fit(X_train, y_train,cat_features=categorical_features_indices,eval_set=(X_test, y_test))
y_pred = catboost_scale3.predict(X_test)

accuracy.append(round(accuracy_score(y_test, y_pred),4))
recall.append(round(recall_score(y_test, y_pred),4))
roc_auc.append(round(roc_auc_score(y_test, y_pred),4))
precision.append(round(precision_score(y_test, y_pred),4))

model_names = ['Catboost_scale3']
result_cat3 = pd.DataFrame({'Accuracy':accuracy,'Recall':recall, 'Roc_Auc':roc_auc, 'Precision':precision}, index=model_names)
result_cat3
Out[51]:
Accuracy Recall Roc_Auc Precision
Catboost_scale3 0.7582 0.8345 0.7821 0.5352
In [52]:
fig, ax = plt.subplots(figsize=(5, 5))
plot_confusion_matrix(catboost_scale3, X_test, y_test, cmap=plt.cm.Greens, ax=ax);
In [53]:
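# Metric containers for the results table.
accuracy = []
recall = []
roc_auc = []
precision = []

# scale_pos_weight=5 weights the positive (churn) class five times as heavily as the negative class.
catboost_scale5 = CatBoostClassifier(verbose=False, random_state=0, scale_pos_weight=5)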

catboost_scale5.fit(X_train, y_train,cat_features=categorical_features_indices,eval_set=(X_test, y_test))
y_pred = catboost_scale5.predict(X_test)

accuracy.append(round(accuracy_score(y_test, y_pred),4))
recall.append(round(recall_score(y_test, y_pred),4))
roc_auc.append(round(roc_auc_score(y_test, y_pred),4))
precision.append(round(precision_score(y_test, y_pred),4))

model_names = ['Catboost_scale5']
result_cat5 = pd.DataFrame({'Accuracy':accuracy,'Recall':recall, 'Roc_Auc':roc_auc, 'Precision':precision}, index=model_names)
result_cat5
Out[53]:
Accuracy Recall Roc_Auc Precision
Catboost_scale5 0.6947 0.9111 0.7626 0.4682
In [54]:
fig, ax = plt.subplots(figsize=(5, 5))
plot_confusion_matrix(catboost_scale5, X_test, y_test, cmap=plt.cm.Greens, ax=ax);
In [55]:
result_catboost= pd.concat([result_cat,result_cat3,result_cat5],axis=0)
result_catboost
Out[55]:
Accuracy Recall Roc_Auc Precision
Catboost 0.8012 0.5105 0.7101 0.6782
Catboost_scale3 0.7582 0.8345 0.7821 0.5352
Catboost_scale5 0.6947 0.9111 0.7626 0.4682
In [56]:
result_catboost.sort_values(by=['Recall'], ascending=True,inplace=True)
fig = px.bar(result_catboost, x='Recall', y=result_catboost.index, color_discrete_sequence=px.colors.qualitative.Bold,title='Catboost Model Comparison',height=600,labels={'index':'MODELS'})
fig.show()

Based on recall, the CatBoost model with scale_pos_weight=5 has the highest score, so I used this model with SHAP in the following step.
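Since higher recall comes with lower precision in the results table, it can help to look at both together. The sketch below combines the precision and recall values reported above into an F1 score (the harmonic mean of the two). The F1 view is an addition for illustration and was not part of the original comparison. The name `f1_from` is hypothetical.

```python
# Precision and recall copied from the results table above.
reported = {
    "Catboost": {"precision": 0.6782, "recall": 0.5105},
    "Catboost_scale3": {"precision": 0.5352, "recall": 0.8345},
    "Catboost_scale5": {"precision": 0.4682, "recall": 0.9111},
}


def f1_from(precision, recall):
    """F1 = 2 * precision * recall / (precision + recall); 0.0 if both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


f1_by_model = {name: round(f1_from(**scores), 4) for name, scores in reported.items()}
highest_recall = max(reported, key=lambda name: reported[name]["recall"])

for name, f1 in f1_by_model.items():
    print(f"{name}: recall={reported[name]['recall']}, F1={f1}")
print("Highest recall:", highest_recall)
```

Which model to prefer depends on how costly a missed churner is compared with contacting a customer who would have stayed anyway. This notebook prioritises recall.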

SHAP Summary Plot¶

We explain the CatBoost classification model using SHAP.

In [57]:
explainercat = shap.TreeExplainer(catboost_scale5)
shap_values_cat_test = explainercat.shap_values(X_test)
shap_values_cat_train = explainercat.shap_values(X_train)
In [147]:
shap.summary_plot(shap_values_cat_train, X_train, plot_type="bar")

As the bar chart shows, the top five features behind the model's churn predictions are Contract, Internet Service, tenure, Payment Method and Paperless Billing.
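The bar version of the SHAP summary plot ranks features by the mean absolute SHAP value across rows. A minimal sketch of that ranking in plain Python follows. The SHAP values are made-up numbers for three features (not the notebook's output), and `mean_abs_shap` is a hypothetical helper.

```python
def mean_abs_shap(shap_rows, feature_names):
    """Rank features by the mean absolute SHAP value over all rows."""
    n_rows = len(shap_rows)
    importance = {
        name: sum(abs(row[j]) for row in shap_rows) / n_rows
        for j, name in enumerate(feature_names)
    }
    return sorted(importance.items(), key=lambda item: item[1], reverse=True)


# Made-up SHAP values: one row per customer, one column per feature.
toy_features = ["Contract", "tenure", "gender"]
toy_shap = [
    [0.8, -0.3, 0.05],
    [-0.6, 0.4, -0.02],
    [0.7, -0.2, 0.01],
]

ranking = mean_abs_shap(toy_shap, toy_features)
for name, value in ranking:
    print(f"{name}: {value:.3f}")
```

Because absolute values are averaged, the ranking shows how strongly a feature moves predictions, not in which direction. With the notebook's arrays, the same quantity would be `np.abs(shap_values_cat_train).mean(axis=0)`, assuming `shap_values_cat_train` is a 2-D array of rows by features.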

How do we engage customers to stay with us?¶

As the model above shows, contract type is the leading driver of customer churn, and month-to-month contracts have the highest churn proportion of all contract types. I believe we should offer these customers a unique deal, such as unlimited calling for the first two months, to retain their business.

How could we retain more customers?¶

To retain more customers, we should offer a deal that attracts new ones, such as a free Netflix subscription for the first three months with the purchase of an internet plan. In my opinion, customer service is crucial: if a business develops good customer service, it will find it easier to attract new customers, and they will stay with the organisation longer.
